Multiple Imputation to Deal with Missing Clinical Data in Rheumatologic Surveys: an Application in the WHO-ILAR COPCORD Study in Iran
Abstract
BACKGROUND: This article demonstrates the use of multiple imputation (MI) to handle missing clinical data in rheumatologic surveys, using data from 10291 people who participated in the first phase of the Community Oriented Program for Control of Rheumatic Disorders (COPCORD) in Iran.

METHODS: Five data subsets were produced from the original data set. Certain demographics were selected as complete variables. In each subset, we created a univariate pattern of missingness for knee osteoarthritis status, the outcome variable (disease), using different missingness mechanisms and percentages. The crude disease proportion and its standard error were estimated separately for each complete data set and used as the true (baseline) values for calculating percent bias. The same parameters were then estimated for each incomplete data subset using two approaches to missing data: complete case analysis (CCA) and MI with various numbers of imputations. The two approaches were compared using an appropriate analysis of variance.

RESULTS: With CCA, the percent bias associated with missing data was 8.67 (95% CI: 7.81-9.53) for the proportion and 13.67 (95% CI: 12.60-14.74) for the standard error. With MI (M=15), the corresponding values were 6.42 (95% CI: 5.56-7.29) and 10.04 (95% CI: 8.97-11.11). Percent bias in estimating the disease proportion and its standard error was significantly lower with MI than with CCA (P < 0.05).

CONCLUSION: For estimating the prevalence of rheumatic disorders such as knee osteoarthritis, MI using available demographics is superior to CCA.
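The comparison described in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' procedure: the data are synthetic, the single complete demographic (`age_group`), the disease and missingness rates, the Beta-Bernoulli imputation model, and the absolute-relative-difference definition of percent bias are all hypothetical choices made for illustration. The imputed estimates are pooled with Rubin's rules.

```python
import numpy as np

def percent_bias(estimate, truth):
    """Assumed definition: absolute relative difference from the complete-data value, in percent."""
    return 100.0 * abs(estimate - truth) / truth

def prop_and_se(y):
    """Crude proportion and its standard error for a 0/1 vector."""
    p = y.mean()
    return p, np.sqrt(p * (1.0 - p) / y.size)

def cca(y_obs):
    """Complete case analysis: drop records whose outcome is missing."""
    return prop_and_se(y_obs[~np.isnan(y_obs)])

def mi(y_obs, strata, m=15, rng=None):
    """Multiple imputation of a binary outcome within strata of a complete covariate.

    For each imputation, a stratum-specific probability is drawn from its Beta
    posterior (uniform prior), missing outcomes are drawn from that probability,
    and the m estimates are pooled with Rubin's rules.
    """
    rng = np.random.default_rng() if rng is None else rng
    missing = np.isnan(y_obs)
    estimates, variances = [], []
    for _ in range(m):
        y_imp = y_obs.copy()
        for s in np.unique(strata):
            in_s = strata == s
            obs = y_obs[in_s & ~missing]
            p_draw = rng.beta(1 + obs.sum(), 1 + obs.size - obs.sum())
            n_mis = int((in_s & missing).sum())
            y_imp[in_s & missing] = rng.binomial(1, p_draw, size=n_mis)
        p, se = prop_and_se(y_imp)
        estimates.append(p)
        variances.append(se ** 2)
    q_bar = np.mean(estimates)                    # pooled point estimate
    within = np.mean(variances)                   # within-imputation variance
    between = np.var(estimates, ddof=1)           # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between    # Rubin's total variance
    return q_bar, np.sqrt(total)

# Synthetic data (hypothetical rates): disease and missingness both rise with age group,
# so the outcome is missing at random given the complete covariate.
rng = np.random.default_rng(0)
n = 10291
age_group = rng.integers(0, 4, size=n)
y = rng.binomial(1, np.array([0.05, 0.10, 0.20, 0.35])[age_group]).astype(float)
y_obs = y.copy()
y_obs[rng.random(n) < np.array([0.10, 0.20, 0.30, 0.40])[age_group]] = np.nan

true_p, true_se = prop_and_se(y)
cca_p, cca_se = cca(y_obs)
mi_p, mi_se = mi(y_obs, age_group, m=15, rng=rng)

print(f"CCA percent bias (proportion): {percent_bias(cca_p, true_p):.2f}")
print(f"MI  percent bias (proportion): {percent_bias(mi_p, true_p):.2f}")
print(f"CCA percent bias (SE): {percent_bias(cca_se, true_se):.2f}")
print(f"MI  percent bias (SE): {percent_bias(mi_se, true_se):.2f}")
```

In this setup the missingness depends on a covariate that also predicts disease, which is the situation where imputing from available demographics would be expected to help; under a completely random mechanism the two approaches should give similar proportions.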
Similar resources
Estimating the prevalence and disease characteristics of rheumatoid arthritis in Tehran: A WHO-ILAR COPCORD Study (from Iran COPCORD study, Urban Study stage 1)
Background: To estimate the prevalence and characteristics of Rheumatoid Arthritis (RA) in an urban area of Tehran. Methods: A total of 50 clusters were randomly selected in Tehran and 10291 subjects completed the COPCORD Core Questionnaire during 2004 and 2005. Patients with rheumatic complaints were examined and diagnosed by subspecialty fellows in rheumatology. Laboratory and radiology ...
Several approaches to handling missing values of quantitative variables and their effect on the results of a clinical trial
Background and Objectives: A major challenge in longitudinal studies is the problem of missing data. Missing data may cause the loss of part of the information, which reduces the accuracy of the estimator and makes the results biased and inaccurate. Therefore, it is necessary to evaluate the missing data mechanism in a longitudinal study and to consider thi...
Accuracy evaluation of different statistical and geostatistical censored data imputation approaches (Case study: Sari Gunay gold deposit)
Most geochemical datasets include missing data in varying proportions, which may cause a significant problem in geostatistical modeling or multivariate analysis of the data. Therefore, it is common to impute the missing data in most geochemical studies. In this study, three approaches called half detection (HD), multiple imputation (MI), and the cosimulation based on Markov model 2...
Selection of Variables that Influence Drug Injection in Prison: Comparison of Methods with Multiple Imputed Data Sets
Background: Prisoners, compared to the general population, are at greater risk of infection. Drug injection is the main route of HIV transmission, particularly in Iran. It is therefore of interest to determine the variables that govern drug injection among prisoners. However, one of the issues that challenge model building is incomplete national data sets. In this paper, we addressed the process ...
Application of multiple imputation in medical and epidemiological research
Missing data, which occur for different reasons, are an unavoidable problem in epidemiological studies. The problem is widespread, and many methodologists therefore consider it a challenge in research design and data analysis. Complete case analysis is often used in studies with missing data; however, this approach may result in inaccurate estimates and inferences due to bias associated...